TENSOR ANALYSIS OF ANOVA DECOMPOSITION

§1  Introduction. 

The purpose of this paper is to demonstrate the almost complete analogy between
ANOVA for general n-factor crossed layouts and the ANOVA type decomposition
of square integrable statistics used in the literature on U-statistics and in connection with
the Jackknife estimate of variance (Efron and Stein(1981), Bhargava(1980), Karlin and
Rinott(1982)). This will be done using notions and notations of tensor analysis and multilinear
algebra. It will be clear that the latter is just an infinite dimensional generalization
of  the  former.  Usually  the  analogy  between  the  two  is  understood  in  an  operational  sense, 
namely how the higher order interaction terms are defined by an inclusion-exclusion
argument. Various subclass means are added and subtracted in the usual ANOVA; various
conditional expectations in the second case. Then the orthogonality of the interaction
terms is proved. For ANOVA Mann(1949, Chapter 5) gives a classical treatment. See
also Han(1977) for a treatment in modern terminology. By using tensor analysis we can
in  a  sense  reverse  the  argument.  Orthogonal  subspaces  of  an  appropriate  vector  space 
can  be  directly  described.  Only  the  dimensionality  is  different  in  the  two  cases.  The 
inclusion-exclusion  pattern  then  follows  from  the  form  of  the  orthogonal  projectors  onto 
these  subspaces. 

In ANOVA and experimental design the tensor approach has been employed by a
number of people. It provides a natural and powerful tool for treating general n-factor
crossed layouts and other designs. Unfortunately terminology and notation were not
standardized; in particular, essentially the same notion has been called the tensor, Kronecker,
direct, or outer product. The approaches employed were sometimes elementary, sometimes more
abstract. This is one of the reasons why this approach has not been taught very often.

In the field of experimental design, Kurkjian and Zelen(1962) introduced a “calculus
for factorial arrangements”. Following this work there have been many papers using direct
product notation for construction and analysis of various designs, including Kurkjian and
Zelen(1963), Zelen and Federer(1964,1965), Federer and Zelen(1966), Bock(1963), Paik and
Federer(1974), Cotter, John, and Smith(1973), Cotter(1974,1975), John and Dean(1975a,b).
The terminology and notational conventions introduced by Kurkjian and Zelen(1962) seem
to be rather arbitrary. The connection between their “calculus” and standard tensor
analysis or multilinear algebra was not made clear. Another drawback is that they confined
their theory to the usual matrix theory, so that multilinear aspects tend to be lost. For example
they define the direct product of matrices as a partitioned matrix of larger dimensionality
(this is still a common practice today in statistics). But this introduces an unpleasant
ordering of indices and the symmetry inherent in the problem becomes obscured.

Another group of people employing this technique is found in the coordinate-free
approach to linear models, for example Jacobsen(1968), Eaton(1970), Haberman(1975).
Jacobsen(1968) seems to be the first systematic treatment of ANOVA from the viewpoint
of multilinear algebra. In addition to the new viewpoint his treatment of the nested model
and the missing observation method is interesting. Unfortunately his results do not seem
to have been published in a more widely available form and have been almost forgotten
in the later literature. Furthermore his treatment suffers from excessive mathematical
formalism and arbitrary notational conventions. Later Haberman(1975) gave a thorough
treatment which can be regarded as the standard reference so far. One problem with these
mathematical treatments is that the essentially elementary nature of the approach and
its practical computational aspects are often difficult to grasp.

In Section 2 we define tensors as multidimensional arrays as in the usual tensor
analysis (Sokolnikoff(1964), Chapter 2). By doing this the unpleasant ordering of indices
mentioned above is avoided. Operations on these arrays are explicitly described. In any high
level computer language multidimensional arrays can be used as easily as matrices, so this
approach can be immediately incorporated in computer programs. Standard terminology
of tensor analysis and multilinear algebra will be employed. Furthermore we develop the
theory in such a way that it can be easily generalized to $L^2$-spaces.

In  Section  3  we  briefly  look  at  the  general  n-factor  crossed  layout. 

In Section 4 we treat the ANOVA type decomposition of a statistic with finite
second moment by generalizing the results of Sections 2 and 3 to $L^2$-spaces. The decomposition
was first introduced by Hoeffding(1948) in connection with U-statistics. Often
the linear terms of this decomposition (corresponding to the main effects in ANOVA) are
called the Hajek projection following Hajek(1968) and are used extensively to prove asymptotic
normality of various statistics. See Serfling(1980) for further references. Recently more
attention has been paid to the full decomposition. Rubin and Vitale(1980) developed a general
asymptotic theory of U-statistics using the full decomposition. Efron and Stein(1981) used
the decomposition in their study of the Jackknife estimate of variance. Further results
and generalizations are given in Bhargava(1980) and Karlin and Rinott(1982). In the absence
of a standard reference for the decomposition each of them gave a definition of the
decomposition using some variation of an inclusion-exclusion argument. Our approach
is different from these, as mentioned at the beginning.

§2  Tensor  products  of  vectors,  matrices,  vector  spaces  and  subspaces. 

In this section we develop a theory of tensors. Particular references used in this
section are Greub(1978), Chapter 1 and Sokolnikoff(1964), Chapter 2. A full abstract
treatment can be found in Chapter 1 of Greub(1978). In a pure mathematical treatment
tensors are developed in a coordinate-free way (Greub(1978)). This is elegant but not
desirable from the viewpoint of computational applicability in statistics. On the other
hand the traditional tensor analysis (Sokolnikoff(1964)) is more practical but is too closely
tied to physics, and much emphasis is placed on curvilinear coordinates which we do not
need here. We take the appropriate notions and notations from both of them. Proofs
can be found in the references given above and hence are omitted below except in a few
places.

Let $R^m$ be the set of all column vectors $x = (x^1, \ldots, x^m)'$ with $m$ real
components. To denote the components of a column (or contravariant) vector we use
superscripts following the traditional notation in tensor analysis. Vector addition and scalar
multiplication are defined in the usual componentwise way. Now the tensor (Kronecker, direct,
outer) product $x \otimes y$ of $x \in R^m$ and $y \in R^n$ is a two-dimensional array defined by
componentwise multiplication:

(2.1)  $(x \otimes y)^{ij} = x^i \cdot y^j.$

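Namely, $x \otimes y$ is a two-dimensional array of dimensions $m$ and $n$ whose $(i,j)$ element is
$x^i y^j$ ($i = 1, \ldots, m$, $j = 1, \ldots, n$). Now we define an addition of tensor products in a
componentwise way:

(2.2)  $(a\,\underset{1}{x} \otimes \underset{1}{y} + b\,\underset{2}{x} \otimes \underset{2}{y})^{ij} = a\,\underset{1}{x}^i\,\underset{1}{y}^j + b\,\underset{2}{x}^i\,\underset{2}{y}^j,$

where $a$, $b$ are scalars. This leads to a vector space generated by $\{x \otimes y,\ x \in R^m,\ y \in R^n\}$,
which we denote by $R^m \otimes R^n$. Namely

(2.3)  $R^m \otimes R^n = \mathrm{span}\{x \otimes y,\ x \in R^m,\ y \in R^n\} = \{\sum_{i=1}^{k} a_i\,(\underset{i}{x} \otimes \underset{i}{y}),\ k \text{ finite}\}.$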

Here the index $i$ is written directly below the corresponding vectors $x$ and $y$ because the usual
subscripts are used as covariant indices in tensor analysis. This point will be discussed later
in this section in connection with linear transformations. $R^m \otimes R^n$ is called the tensor
product of $R^m$ and $R^n$. A general element $u \in R^m \otimes R^n$ is simply called a tensor.

As one might expect, $R^m \otimes R^n$ is just the set of all two-dimensional arrays of
dimensions $m$ and $n$. We will make this point clear in a couple of propositions.

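Lemma 2.1. $x \otimes y$ is bilinear in $x$ and $y$. Namely

(2.4)  $(a\,\underset{1}{x} + b\,\underset{2}{x}) \otimes y = a(\underset{1}{x} \otimes y) + b(\underset{2}{x} \otimes y),$
       $x \otimes (c\,\underset{1}{y} + d\,\underset{2}{y}) = c(x \otimes \underset{1}{y}) + d(x \otimes \underset{2}{y}),$

where $a$, $b$, $c$, $d$ are scalars.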

Let $\underset{i}{e}$ denote the vector in $R^m$ whose $i$-th element is one and whose other elements are
zero. $\{\underset{i}{e},\ i = 1, \ldots, m\}$ forms an obvious basis of $R^m$. Now consider $\underset{i}{e} \otimes \underset{j}{e}$, which has 1 in
the $(i,j)$-position and 0 everywhere else. Then

Proposition 2.1. $\{\underset{i}{e} \otimes \underset{j}{e},\ i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $R^m \otimes R^n$,

hence

Corollary 2.1. $\dim(R^m \otimes R^n) = mn$ and $R^m \otimes R^n$ coincides with the set of all two-
dimensional arrays of dimensions $m$ and $n$.

Remark 2.1. An element $u \in R^m \otimes R^n$ which can be written as $u = x \otimes y$ for some
$x \in R^m$, $y \in R^n$ is called decomposable. $R^m \otimes R^n$ does not consist only of decomposable
elements. This is easily seen by noting that $x \otimes y$ is of “rank 1” in the terminology of the
usual matrix theory.
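The definition (2.1) and Remark 2.1 can be illustrated with a minimal sketch in Python, using nested lists as two-dimensional arrays. The helper names `outer` and `is_decomposable` are hypothetical and not part of the paper, indices are zero-based rather than one-based, and the rank-1 test simply checks that all 2 x 2 minors vanish.

```python
def outer(x, y):
    """Tensor product (2.1): array whose (i, j) element is x[i] * y[j]."""
    return [[xi * yj for yj in y] for xi in x]


def is_decomposable(u):
    """True if the 2-D array u equals x (x) y for some vectors x, y,
    i.e. every 2 x 2 minor of u vanishes (matrix rank at most 1)."""
    m, n = len(u), len(u[0])
    return all(
        u[i][j] * u[k][l] == u[i][l] * u[k][j]
        for i in range(m) for k in range(i + 1, m)
        for j in range(n) for l in range(j + 1, n)
    )


x = [1, 2, 3]
y = [4, 5]
u = outer(x, y)                # a 3 x 2 array, decomposable by construction

# e_1 (x) e_1 + e_2 (x) e_2 is a sum of two decomposable elements
# that is not itself decomposable (it has rank 2 as a matrix).
identity = [[1, 0], [0, 1]]

print(u, is_decomposable(u), is_decomposable(identity))
```

The second example is the 2 x 2 identity array, which lies in the span (2.3) but is not of the form $x \otimes y$.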
In $R^m$ we have the usual inner product. In $R^m \otimes R^n$ a natural inner product is
defined in an analogous way. Let $u, v \in R^m \otimes R^n$. Then we define

(2.5)  $(u, v) = \sum_{i=1}^{m} \sum_{j=1}^{n} u^{ij} v^{ij}.$

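Proposition 2.2. For two decomposable elements $x \otimes \tilde{x}$, $y \otimes \tilde{y}$ of $R^m \otimes R^n$ we have

(2.6)  $(x \otimes \tilde{x},\ y \otimes \tilde{y}) = (x, y) \cdot (\tilde{x}, \tilde{y}).$

Proof:

$(x \otimes \tilde{x},\ y \otimes \tilde{y}) = \sum_{i,j} (x \otimes \tilde{x})^{ij} \cdot (y \otimes \tilde{y})^{ij}$
    $= \sum_{i,j} x^i \tilde{x}^j y^i \tilde{y}^j$
    $= \sum_{i} x^i y^i \sum_{j} \tilde{x}^j \tilde{y}^j$
    $= (x, y) \cdot (\tilde{x}, \tilde{y}).$  ∎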


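Now we proceed to define tensor products of more than two vectors. Let $\underset{i}{x} \in R^{m_i}$,
$i = 1, \ldots, k$. Then $\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}$ is defined to be a $k$-dimensional array of dimensions
$m_1, \ldots, m_k$ such that

(2.7)  $(\underset{1}{x} \otimes \cdots \otimes \underset{k}{x})^{i_1 \cdots i_k} = \underset{1}{x}^{i_1} \cdots \underset{k}{x}^{i_k}.$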

Addition is defined componentwise and the space generated by $\{\underset{1}{x} \otimes \cdots \otimes \underset{k}{x},\ \underset{i}{x} \in R^{m_i},\ i = 1, \ldots, k\}$
is called the tensor product of $R^{m_1}, \ldots, R^{m_k}$ and denoted by $R^{m_1} \otimes \cdots \otimes R^{m_k}$
or $\otimes_{i=1}^{k} R^{m_i}$. Lemma 2.1, Proposition 2.1, Corollary 2.1 hold for $k > 2$ with obvious
modifications. Now for general elements $u$, $v$ of $\otimes_{i=1}^{k} R^{m_i}$ we define the natural inner product
by

(2.8)  $(u, v) = \sum_{i_1=1}^{m_1} \cdots \sum_{i_k=1}^{m_k} u^{i_1 \cdots i_k} v^{i_1 \cdots i_k}.$

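Then analogous to (2.6) for two decomposable elements $\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}$, $\underset{1}{y} \otimes \cdots \otimes \underset{k}{y}$ of $\otimes_{i=1}^{k} R^{m_i}$
we have

(2.9)  $(\underset{1}{x} \otimes \cdots \otimes \underset{k}{x},\ \underset{1}{y} \otimes \cdots \otimes \underset{k}{y}) = (\underset{1}{x}, \underset{1}{y}) \cdot (\underset{2}{x}, \underset{2}{y}) \cdots (\underset{k}{x}, \underset{k}{y}).$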

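Remark 2.2. If $\underset{i}{x}$ and $\underset{i}{y}$ are orthogonal for some $i$, then $\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}$ and $\underset{1}{y} \otimes \cdots \otimes \underset{k}{y}$ are
orthogonal.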
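The factorization (2.9) and Remark 2.2 can be checked numerically with a short sketch. Here a $k$-dimensional array is stored as a Python dict keyed by index tuples; this representation and the helper names `tensor`, `inner` and `dot` are assumptions made for illustration only, and the vectors are arbitrary test values.

```python
from itertools import product
from math import prod


def tensor(*vectors):
    """k-fold tensor product (2.7), stored as {(i1, ..., ik): value}."""
    ranges = [range(len(v)) for v in vectors]
    return {idx: prod(v[i] for v, i in zip(vectors, idx))
            for idx in product(*ranges)}


def inner(u, v):
    """Natural inner product (2.8): sum of componentwise products."""
    return sum(u[idx] * v[idx] for idx in u)


def dot(x, y):
    """Usual inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


xs = ([1, 2], [0, 1, 3], [2, -1])
ys = ([3, 1], [1, 1, 1], [1, 1])
zs = ([3, 1], [1, 1, 1], [1, 2])     # third factor is orthogonal to [2, -1]

lhs = inner(tensor(*xs), tensor(*ys))           # left side of (2.9)
rhs = prod(dot(x, y) for x, y in zip(xs, ys))   # right side of (2.9)
orth = inner(tensor(*xs), tensor(*zs))          # Remark 2.2: should vanish

print(lhs, rhs, orth)
```

Because one pair of factors in `xs` and `zs` is orthogonal, the whole tensor products are orthogonal, which is the mechanism behind Theorem 2.1.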

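Next we consider subspaces and their orthogonal complements. Let $U_1, \ldots, U_k$ be
subspaces of $R^{m_1}, \ldots, R^{m_k}$ respectively. Then the subspace $U_1 \otimes \cdots \otimes U_k$ of $R^{m_1} \otimes \cdots \otimes R^{m_k}$
is defined to be the subspace generated by $\{\underset{1}{x} \otimes \cdots \otimes \underset{k}{x},\ \underset{i}{x} \in U_i,\ i = 1, \ldots, k\}$. Namely

(2.10)  $U_1 \otimes \cdots \otimes U_k = \mathrm{span}\{\underset{1}{x} \otimes \cdots \otimes \underset{k}{x},\ \underset{i}{x} \in U_i,\ i = 1, \ldots, k\}.$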

Let $U_i^{\perp}$ denote the orthogonal complement of $U_i$ in $R^{m_i}$. For convenience we
define $U_i^0 = U_i$, $U_i^1 = U_i^{\perp}$. Then we have

Theorem 2.1. The $2^k$ subspaces $\{\otimes_{i=1}^{k} U_i^{\epsilon_i},\ \epsilon_i = 0, 1,\ i = 1, \ldots, k\}$ form a decomposition
of $\otimes_{i=1}^{k} R^{m_i}$ into mutually orthogonal subspaces.

This is clear by taking appropriate orthonormal bases of $R^{m_i}$, $i = 1, \ldots, k$, and
applying Remark 2.2.

Corollary 2.2.

(2.11)  $(U_1 \otimes \cdots \otimes U_k)^{\perp} = \mathrm{span}\{\otimes_{i=1}^{k} U_i^{\epsilon_i},\ \epsilon_i = 1 \text{ for some } i\}.$
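As a consistency check on Theorem 2.1, the dimensions of the $2^k$ subspaces add up to the dimension of the whole space. The short derivation below is a sketch that assumes $\dim(U_1^{\epsilon_1} \otimes \cdots \otimes U_k^{\epsilon_k}) = \prod_i \dim U_i^{\epsilon_i}$, which follows by applying the argument of Corollary 2.1 to bases of the factors.

```latex
\begin{align*}
\sum_{\epsilon \in \{0,1\}^k} \dim\Bigl(\bigotimes_{i=1}^{k} U_i^{\epsilon_i}\Bigr)
  &= \sum_{\epsilon \in \{0,1\}^k} \prod_{i=1}^{k} \dim U_i^{\epsilon_i} \\
  &= \prod_{i=1}^{k} \bigl(\dim U_i + \dim U_i^{\perp}\bigr) \\
  &= \prod_{i=1}^{k} m_i
   = \dim\Bigl(\bigotimes_{i=1}^{k} R^{m_i}\Bigr).
\end{align*}
```

The second equality is the expansion of a product of two-term sums, the same algebraic pattern that later produces the inclusion-exclusion formulas.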

Now we are going to define the tensor product of matrices. An $n \times m$ matrix $A$ is
considered to represent a linear transformation from $R^m$ to $R^n$. In this sense we want to
distinguish matrices from two-dimensional arrays (elements of $R^n \otimes R^m$). In tensor analysis
this is done by writing the second index as a subscript. Namely the $(i, j)$ element of a matrix
$A$ is denoted by $A^i_j$. Superscripts are called contravariant indices and subscripts are called
covariant indices. The reason behind this is discussed in Remark 2.4 below. Now let $\underset{i}{A}$
be $n_i \times m_i$ matrices, $i = 1, \ldots, k$. We want to define a tensor product of $\underset{1}{A}, \ldots, \underset{k}{A}$ in
a meaningful way. For notational convenience we first do this for the case $k = 2$. For
matrices $A$ ($n_1 \times m_1$) and $B$ ($n_2 \times m_2$) we define $A \otimes B$ as a four-dimensional array with
two contravariant indices $i_1$, $i_2$ and two covariant indices $j_1$, $j_2$ such that

(2.12)  $(A \otimes B)^{i_1 i_2}_{j_1 j_2} = A^{i_1}_{j_1} \cdot B^{i_2}_{j_2}.$

This is again a componentwise multiplication as in (2.1). Now $A \otimes B$ defines a linear
transformation from $R^{m_1} \otimes R^{m_2}$ to $R^{n_1} \otimes R^{n_2}$ as follows. Let $u \in R^{m_1} \otimes R^{m_2}$; then
$v = (A \otimes B)u \in R^{n_1} \otimes R^{n_2}$ is defined by

(2.13)  $v^{i_1 i_2} = \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} (A \otimes B)^{i_1 i_2}_{j_1 j_2}\, u^{j_1 j_2} = \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} A^{i_1}_{j_1} B^{i_2}_{j_2}\, u^{j_1 j_2}.$

As in (2.6) we have

Proposition 2.3. For a decomposable element $x \otimes y$ of $R^{m_1} \otimes R^{m_2}$

(2.14)  $(A \otimes B)(x \otimes y) = Ax \otimes By.$

Remark 2.3. We could have used (2.14) as a definition of $A \otimes B$. It shows how $A \otimes B$
maps decomposable elements of $R^{m_1} \otimes R^{m_2}$. Since the decomposable elements generate
$R^{m_1} \otimes R^{m_2}$, $A \otimes B$ for general elements can be defined by linearity. This is more elegant
mathematically but for practical applications formula (2.13) will be useful. The same
remark applies to Proposition 2.2.
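Formula (2.13) translates directly into nested loops over the covariant indices. The following sketch applies $A \otimes B$ to a two-dimensional array without ever forming a partitioned (Kronecker) matrix and checks Proposition 2.3 on one example. The function names and the numerical values are illustrative assumptions; indices are zero-based.

```python
def outer(x, y):
    """Tensor product (2.1) of two vectors as a nested list."""
    return [[xi * yj for yj in y] for xi in x]


def matvec(A, x):
    """Ordinary matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def apply_pair(A, B, u):
    """v = (A (x) B) u by formula (2.13).

    A is n1 x m1, B is n2 x m2, u is an m1 x m2 array;
    the result is an n1 x n2 array."""
    m1, m2 = len(u), len(u[0])
    return [[sum(A[i1][j1] * B[i2][j2] * u[j1][j2]
                 for j1 in range(m1) for j2 in range(m2))
             for i2 in range(len(B))]
            for i1 in range(len(A))]


A = [[1, 2], [0, 1], [3, -1]]      # 3 x 2
B = [[2, 0, 1], [1, 1, 0]]         # 2 x 3
x, y = [1, -1], [2, 0, 5]

left = apply_pair(A, B, outer(x, y))         # (A (x) B)(x (x) y)
right = outer(matvec(A, x), matvec(B, y))    # Ax (x) By, as in (2.14)

print(left == right)
```

Keeping `u` as an array indexed by `(j1, j2)` avoids the ordering of indices that a partitioned-matrix representation would impose.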

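Generalization of the above argument to the tensor product of more than 2 matrices
is immediate. Instead of (2.13) and (2.14) we have

(2.15)  $((\underset{1}{A} \otimes \cdots \otimes \underset{k}{A})u)^{i_1 \cdots i_k} = \sum_{j_1, \ldots, j_k} \underset{1}{A}^{i_1}_{j_1} \cdots \underset{k}{A}^{i_k}_{j_k}\, u^{j_1 \cdots j_k},$

(2.16)  $(\underset{1}{A} \otimes \cdots \otimes \underset{k}{A})(\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}) = \underset{1}{A}\underset{1}{x} \otimes \cdots \otimes \underset{k}{A}\underset{k}{x},$

respectively.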

Remark 2.4. $R^m$ was defined as the set of column or contravariant vectors. The dual
space $R^{m*}$ can be defined as the set of row or covariant vectors whose components are
denoted with subscripts, $x = (x_1, \ldots, x_m)$. Then $R^{m*} \otimes R^{n*}$, $R^m \otimes R^{n*}$, etc., can be defined
in the same way as $R^m \otimes R^n$ is defined by (2.1)-(2.3). By the natural isomorphism between
$R^m \otimes R^{n*}$ and the space of all linear transformations from $R^n$ to $R^m$ (see Greub(1978),
Section 1.28) a linear transformation $A$ can be identified with an element of $R^m \otimes R^{n*}$ and
has one contravariant index and one covariant index.
Our last item in this section is a discussion of orthogonal projectors. Let $V$ be
a vector space and $U \subset V$ be a subspace. A linear transformation $P_U$ from $V$ to itself is
called the orthogonal projector onto $U$ if

(2.17)  $P_U x = x$ for $x \in U$,
        $P_U x = 0$ for $x \in U^{\perp}$.

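Theorem 2.2. Let $U_i \subset R^{m_i}$, $i = 1, \ldots, k$, be subspaces and $P_{U_i}$ be the orthogonal
projectors onto $U_i$ in $R^{m_i}$, $i = 1, \ldots, k$. Then the orthogonal projector onto $\otimes_{i=1}^{k} U_i \subset \otimes_{i=1}^{k} R^{m_i}$
is given by $P_{U_1} \otimes \cdots \otimes P_{U_k}$.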

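Proof: Let $\underset{i}{x} \in U_i$, $i = 1, \ldots, k$. Then by (2.16)

(2.18)  $(P_{U_1} \otimes \cdots \otimes P_{U_k})(\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}) = P_{U_1}\underset{1}{x} \otimes \cdots \otimes P_{U_k}\underset{k}{x} = \underset{1}{x} \otimes \cdots \otimes \underset{k}{x}.$

Hence for general elements $u$ of $U_1 \otimes \cdots \otimes U_k$ we have $(P_{U_1} \otimes \cdots \otimes P_{U_k})u = u$ by linearity.
Now by Corollary 2.2 $(U_1 \otimes \cdots \otimes U_k)^{\perp}$ is generated by $\{\underset{1}{x} \otimes \cdots \otimes \underset{k}{x},\ \underset{i}{x} \in U_i^{\perp} \text{ for some } i\}$.
For such $\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}$

(2.19)  $(P_{U_1} \otimes \cdots \otimes P_{U_k})(\underset{1}{x} \otimes \cdots \otimes \underset{k}{x}) = P_{U_1}\underset{1}{x} \otimes \cdots \otimes P_{U_k}\underset{k}{x} = 0.$

Hence by linearity

$(P_{U_1} \otimes \cdots \otimes P_{U_k})(U_1 \otimes \cdots \otimes U_k)^{\perp} = \{0\}.$  ∎

See Haberman(1975), Lemma 8, for an alternative proof using the fact that $P_{U_1} \otimes
\cdots \otimes P_{U_k}$ is idempotent and self-adjoint.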

§3  ANOVA for crossed layouts.


Now we take a brief look at ANOVA for an n-factor crossed layout with a single
observation per cell. For more detailed treatments of various designs see the references
given in Section 1. For each combination of factor levels $(i_1, \ldots, i_n)$ we have an observation
$x^{i_1 \cdots i_n}$. Therefore the set of observations $\{x^{i_1 \cdots i_n}\}$ can be considered as a (random) tensor
$x \in \otimes_{i=1}^{n} R^{m_i}$. ANOVA is essentially a decomposition of $\otimes_{i=1}^{n} R^{m_i}$ into mutually orthogonal
subspaces. When all interactions are considered it is decomposed into $2^n$ subspaces. Usually
this is done by an inclusion-exclusion argument. Here we give the desired decomposition
directly as follows. Let $\underset{m_i}{1}$ be the vector in $R^{m_i}$ with all components equal to 1. Let
$U_i = \mathrm{span}\{\underset{m_i}{1}\}$ and consider the decomposition of $\otimes_{i=1}^{n} R^{m_i}$ in Theorem 2.1. We use
the notational convention of Theorem 2.1. Following Scheffe(1959), Section 4.6, let $L_{i_1 \cdots i_k}$
denote the $(i_1, \ldots, i_k)$-interaction subspace for $1 \le i_1 < \cdots < i_k \le n$, $k = 0, \ldots, n$. We
claim that

(3.1)  $L_{i_1 \cdots i_k} = U_1^{\epsilon_1} \otimes \cdots \otimes U_n^{\epsilon_n},$

where

    $\epsilon_i = 1$ if $i \in \{i_1, \ldots, i_k\}$,
    $\epsilon_i = 0$ otherwise.

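This can be shown by considering the orthogonal projector onto the right hand
side of (3.1). Note that the orthogonal projector onto $U_i = \mathrm{span}\{\underset{m_i}{1}\}$ is given by (in matrix
form)

(3.2)  $\underset{i}{F} = \frac{1}{m_i}\,\underset{m_i}{1}\,\underset{m_i}{1}'.$

For $x = (x^1, \ldots, x^{m_i})'$ we have $\underset{i}{F}x = (\bar{x}, \ldots, \bar{x})'$, where $\bar{x}$ is the mean of the components of $x$.
Furthermore $\underset{i}{I} - \underset{i}{F}$ is the orthogonal projector onto $U_i^{\perp}$, where $\underset{i}{I}$ denotes the $m_i \times m_i$
identity matrix. Now by Theorem 2.2 the orthogonal projector onto the right hand side of (3.1)
is given by

(3.3)  $P_{i_1 \cdots i_k} = \underset{1}{Q} \otimes \cdots \otimes \underset{n}{Q},$

where

    $\underset{i}{Q} = \underset{i}{I} - \underset{i}{F}$ if $i \in \{i_1, \ldots, i_k\}$,
    $\underset{i}{Q} = \underset{i}{F}$ otherwise.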
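For example consider $P_{12}$:

(3.4)  $P_{12} = (\underset{1}{I} - \underset{1}{F}) \otimes (\underset{2}{I} - \underset{2}{F}) \otimes \underset{3}{F} \otimes \cdots \otimes \underset{n}{F}$
       $= \underset{1}{I} \otimes \underset{2}{I} \otimes \underset{3}{F} \otimes \cdots \otimes \underset{n}{F} - \underset{1}{F} \otimes \underset{2}{I} \otimes \underset{3}{F} \otimes \cdots \otimes \underset{n}{F}$
       $\quad - \underset{1}{I} \otimes \underset{2}{F} \otimes \underset{3}{F} \otimes \cdots \otimes \underset{n}{F} + \underset{1}{F} \otimes \underset{2}{F} \otimes \underset{3}{F} \otimes \cdots \otimes \underset{n}{F}.$

Applying $P_{12}$ to $x$ we obtain the usual expression, in which a dot denotes averaging over
the corresponding index:

(3.5)  $(P_{12}x)^{i_1 \cdots i_n} = \bar{x}^{i_1 i_2 \cdot \cdots \cdot} - \bar{x}^{\cdot i_2 \cdot \cdots \cdot} - \bar{x}^{i_1 \cdot \cdot \cdots \cdot} + \bar{x}^{\cdot \cdot \cdot \cdots \cdot}.$

Note that the expansion of the expression for the projector leads to the inclusion-exclusion.
This pattern should be clear for general $P_{i_1 \cdots i_k}$. This proves (3.1).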
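The equality of the projector form (3.3) and the inclusion-exclusion form (3.5) can be verified on a small example. The sketch below uses a hypothetical 2 x 3 x 2 layout with made-up observations, stores the tensor as a dict keyed by index tuples, and uses exact rational arithmetic. Applying $F$ or $I - F$ one axis at a time is equivalent to applying the tensor product in (3.3), since each factor acts on a different index. All names are illustrative, and axes are numbered from 0, so axes (0, 1) correspond to factors 1 and 2 of the text.

```python
from fractions import Fraction
from itertools import product

shape = (2, 3, 2)   # hypothetical layout: m_1 = 2, m_2 = 3, m_3 = 2
x = {i: Fraction(i[0] * i[1] + 2 * i[2] + i[0] * i[2] + 1)
     for i in product(*map(range, shape))}


def average_out(u, axis):
    """Apply F of (3.2) along one axis: replace each entry by the mean
    over that axis, the other indices being held fixed."""
    return {i: sum(u[i[:axis] + (t,) + i[axis + 1:]]
                   for t in range(shape[axis])) / shape[axis]
            for i in u}


def project(u, factors):
    """Apply P of (3.3): I - F on the axes in `factors`, F elsewhere."""
    for axis in range(len(shape)):
        fu = average_out(u, axis)
        u = {i: u[i] - fu[i] for i in u} if axis in factors else fu
    return u


def cell_mean(u, idx, keep):
    """Subclass mean: average of u over all cells that agree with idx
    on the axes listed in `keep`."""
    cells = [j for j in u if all(j[a] == idx[a] for a in keep)]
    return sum(u[j] for j in cells) / len(cells)


p12 = project(x, factors=(0, 1))
incl_excl = {i: cell_mean(x, i, (0, 1)) - cell_mean(x, i, (1,))
                - cell_mean(x, i, (0,)) + cell_mean(x, i, ())
             for i in x}

print(p12 == incl_excl)
```

The function `cell_mean` computes the subclass means independently of `average_out`, so the comparison is a genuine check of (3.5) rather than a restatement of (3.4).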

The sum of squares due to the $(i_1, \ldots, i_k)$-interaction for an observed tensor $x$ is given
by

(3.6)  $S_{i_1 \cdots i_k} = (P_{i_1 \cdots i_k}x,\ P_{i_1 \cdots i_k}x).$

Note that the actual computation of (3.6) can be done using (3.3), (2.15) and (2.8).

The degrees of freedom (d.f.) of the $(i_1, \ldots, i_k)$-interaction is given by $\dim(L_{i_1 \cdots i_k})$.
Noting that $\dim(U_i) = 1$ and $\dim(U_i^{\perp}) = m_i - 1$ we obtain by Corollary 2.1

(3.7)  d.f. of $(i_1, \ldots, i_k)$-interaction $= \prod_{j=1}^{k} (m_{i_j} - 1).$
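A full ANOVA table for a small layout can be sketched along the same lines: every one of the $2^n$ projections is computed from (3.3), the sums of squares from (3.6) and (2.8), and the degrees of freedom from (3.7). The data, the dict representation and the helper names are assumptions for illustration; axes are numbered from 0.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

shape = (2, 3, 2)   # hypothetical layout with one observation per cell
x = {i: Fraction(i[0] * i[1] + 2 * i[2] + i[0] * i[2] + 1)
     for i in product(*map(range, shape))}


def average_out(u, axis):
    """Apply F of (3.2) along one axis (mean over that index)."""
    return {i: sum(u[i[:axis] + (t,) + i[axis + 1:]]
                   for t in range(shape[axis])) / shape[axis]
            for i in u}


def project(u, factors):
    """Apply P of (3.3): I - F on the axes in `factors`, F elsewhere."""
    for axis in range(len(shape)):
        fu = average_out(u, axis)
        u = {i: u[i] - fu[i] for i in u} if axis in factors else fu
    return u


n = len(shape)
subsets = [s for k in range(n + 1) for s in combinations(range(n), k)]

components = {s: project(x, s) for s in subsets}
ss = {s: sum(v * v for v in components[s].values()) for s in subsets}  # (3.6)
df = {s: prod(shape[a] - 1 for a in s) for s in subsets}               # (3.7)

total_ss = sum(v * v for v in x.values())
for s in subsets:
    print(s, df[s], ss[s])
```

By Theorem 2.1 the components are mutually orthogonal and sum to `x`, so the sums of squares add up to `total_ss` and the degrees of freedom add up to the number of cells. The empty subset corresponds to the grand mean.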


§4  ANOVA decomposition of a statistic.

In this section we study the ANOVA type decomposition of a square integrable
statistic $S(x_1, \ldots, x_n)$. For this purpose we extend the results in the previous sections
to $L^2$-spaces. Particular references used here are Maurin(1967), Section 3.10 and Murray
and von Neumann(1936), Chapter 2. Let $(X_1, \mu_1), \ldots, (X_n, \mu_n)$ be probability spaces. We
consider the $L^2$-space of the product probability space $(X_{i_1}, \mu_{i_1}) \times \cdots \times (X_{i_k}, \mu_{i_k})$:

(4.1)  $L^2(X_{i_1}, \ldots, X_{i_k}) = \{\phi(x_{i_1}, \ldots, x_{i_k}) \mid \int \phi^2\, \mu_{i_1}(dx_{i_1}) \cdots \mu_{i_k}(dx_{i_k}) < \infty\}.$


Note that $L^2(X_{i_1}, \ldots, X_{i_k}) \subset L^2(X_{j_1}, \ldots, X_{j_l})$ if $\{i_1, \ldots, i_k\} \subset \{j_1, \ldots, j_l\}$. For simplicity
we assume that $X_1, \ldots, X_n$ are locally compact, separable, metrizable spaces so that the $L^2$-
spaces in (4.1) are separable. See Dieudonne(1976), Chap. 13.

For notational convenience let $n = 2$, the general case being an obvious modification.
Let $\phi(x_1) \in L^2(X_1)$, $\psi(x_2) \in L^2(X_2)$. Intuitively we can think of $\phi$, $\psi$ as having continuous
indices $x_1$, $x_2$. Sums of squares are replaced by integrals of squares. Now define $\phi \otimes \psi$ by
componentwise multiplication:

(4.2)  $(\phi \otimes \psi)(x_1, x_2) = \phi(x_1) \cdot \psi(x_2).$

Note that

(4.3)  $\int (\phi \otimes \psi)^2\, \mu_1(dx_1)\mu_2(dx_2) = \int \phi(x_1)^2\, \mu_1(dx_1) \int \psi(x_2)^2\, \mu_2(dx_2) < \infty.$

Hence $\phi \otimes \psi \in L^2(X_1, X_2)$.

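Now let $L^2(X_1) \otimes L^2(X_2) \subset L^2(X_1, X_2)$ be defined as in (2.3), namely

(4.4)  $L^2(X_1) \otimes L^2(X_2) = \mathrm{span}\{\phi \otimes \psi,\ \phi \in L^2(X_1),\ \psi \in L^2(X_2)\}$
       $= \text{closure of } \{\sum_{i=1}^{k} a_i\, \phi_i \otimes \psi_i,\ k \text{ finite}\}.$

Proposition 4.1. $L^2(X_1) \otimes L^2(X_2) = L^2(X_1, X_2)$.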

This is a standard construction (Maurin(1967), Example of Section 3.10) and a
simple consequence of the following well-known result.

Lemma 4.1. Let $\{\phi_1, \phi_2, \ldots\}$ and $\{\psi_1, \psi_2, \ldots\}$ be complete orthonormal systems
of $L^2(X_1)$, $L^2(X_2)$ respectively. Then $\{\phi_i \psi_j,\ i = 1, 2, \ldots,\ j = 1, 2, \ldots\}$ is a complete
orthonormal system of $L^2(X_1, X_2)$.

For a proof of this see Murray and von Neumann(1936), Lemma 2.2.1 or Courant
and Hilbert(1937), Sec. II.1.6. Lemma 4.1 shows that, as in Proposition 2.1 and Corollary 2.1,
decomposable elements of the form $\phi_i \cdot \psi_j$ generate the whole $L^2(X_1, X_2)$ space.


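Now let us take a look at the inner product. Let $\phi_1, \phi_2 \in L^2(X_1)$, $\psi_1, \psi_2 \in L^2(X_2)$.
Then

(4.5)  $(\phi_1 \otimes \psi_1,\ \phi_2 \otimes \psi_2)$
       $= \int \phi_1(x_1)\psi_1(x_2)\phi_2(x_1)\psi_2(x_2)\, \mu_1(dx_1)\mu_2(dx_2)$
       $= \int \phi_1(x_1)\phi_2(x_1)\, \mu_1(dx_1) \int \psi_1(x_2)\psi_2(x_2)\, \mu_2(dx_2)$
       $= (\phi_1, \phi_2) \cdot (\psi_1, \psi_2).$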

This is the same relation as in Proposition 2.2. We see that the inner product of $L^2(X_1, X_2)$
corresponds to the inner product introduced on $R^m \otimes R^n$ in Section 2. Therefore all
orthogonality relations of Section 2 can be translated here. In particular Theorem 2.1
can be generalized as

Theorem 4.1. Let $U_i$ be a closed subspace of $L^2(X_i)$, $i = 1, \ldots, n$. Let $U_1 \otimes \cdots \otimes U_n$
be defined as the closed subspace generated by $\{\phi_1 \otimes \cdots \otimes \phi_n,\ \phi_i \in U_i,\ i = 1, \ldots, n\}$.
Let $U_i^0 = U_i$, $U_i^1 = U_i^{\perp}$ for convenience. Then the $2^n$ subspaces $\{\otimes_{i=1}^{n} U_i^{\epsilon_i},\ \epsilon_i =
0, 1,\ i = 1, \ldots, n\}$ form a decomposition of $L^2(X_1, \ldots, X_n)$ into mutually orthogonal
closed subspaces.

Now let $F_i : L^2(X_i) \to L^2(X_i)$, $i = 1, 2$, be bounded linear transformations. We
define a linear operator (easily seen to be bounded) $F_1 \otimes F_2 : L^2(X_1, X_2) \to L^2(X_1, X_2)$ by

(4.6)  $(F_1 \otimes F_2)(\phi_1 \otimes \phi_2) = F_1\phi_1 \otimes F_2\phi_2$

for decomposable elements and extend by linearity. See Remark 2.3. For a further
justification of this see Murray and von Neumann(1936).

Next we consider orthogonal projections. Note that the definition of the orthogonal
projector in (2.17) is independent of the dimensionality. Therefore with the same proof as for
Theorem 2.2 we have

Theorem 4.2. Let $U_i \subset L^2(X_i)$, $i = 1, \ldots, n$, be closed subspaces and $P_{U_i}$ be the
orthogonal projectors onto $U_i$ in $L^2(X_i)$, $i = 1, \ldots, n$. Then the orthogonal projector onto
$\otimes_{i=1}^{n} U_i \subset L^2(X_1, \ldots, X_n)$ is given by $P_{U_1} \otimes \cdots \otimes P_{U_n}$.




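Now let us come back to the ANOVA type decomposition. Let $1_i(x_i) \equiv 1 \in L^2(X_i)$
and $U_i = \mathrm{span}\{1_i\} \subset L^2(X_i)$. Let $F_i$ be the linear transformation corresponding to taking the
mean. For $\phi \in L^2(X_i)$

(4.7)  $F_i\phi = \mathcal{E}\phi = (\mathcal{E}\phi)\,1_i \in L^2(X_i).$

Then $F_i 1_i = 1_i$ and $F_i\phi = 0$ for $\phi \in L^2(X_i)$ such that $(1_i, \phi) = 0$. Therefore $F_i$ is the
orthogonal projector onto $U_i$ and $P_{U_i} = F_i$. Denoting the identity map of $L^2(X_i)$ by $I_i$ we
have $P_{U_i^{\perp}} = I_i - F_i$. Now we define $L_{i_1 \cdots i_k}$ by (3.1) and $P_{i_1 \cdots i_k}$ by (3.3) with $I_i$, $F_i$ replacing
the corresponding matrices $\underset{i}{I}$, $\underset{i}{F}$ of Section 3.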


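To see how $P_{i_1 \cdots i_k}$ behaves we fix complete orthonormal systems $\{\phi^i_1, \phi^i_2, \ldots\}$
of $L^2(X_i)$, $i = 1, \ldots, n$, such that $\phi^i_1 = 1_i$, $i = 1, \ldots, n$. Note that $P_{U_i}\phi^i_1 = \phi^i_1$,
$P_{U_i}\phi^i_j = 0$ for $j \ge 2$. Also $(I_i - P_{U_i})\phi^i_1 = 0$, $(I_i - P_{U_i})\phi^i_j = \phi^i_j$ for $j \ge 2$. Using these
relations we obtain

(4.8)  $P_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}) = \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}$  if $j_l \ge 2$ for
           $l \in \{i_1, \ldots, i_k\}$ and $j_l = 1$ for $l \notin \{i_1, \ldots, i_k\}$,
       $= 0$  otherwise.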


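Now consider $S \in L^2(X_1, \ldots, X_n)$. By Lemma 4.1 $\{\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\}$ forms a basis
of $L^2(X_1, \ldots, X_n)$. Hence we can write

(4.9)  $S = \sum_{j_1} \cdots \sum_{j_n} a_{j_1 \cdots j_n}\, \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n},$

where

$a_{j_1 \cdots j_n} = \int S(x_1, \ldots, x_n)\, \phi^1_{j_1}(x_1) \cdots \phi^n_{j_n}(x_n)\, \mu_1(dx_1) \cdots \mu_n(dx_n).$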

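Using (4.8) we obtain the following theorem.

Theorem 4.3.

(4.10)  $P_{i_1 \cdots i_k}S = \sum_{j_t \ge 2,\ 1 \le t \le k} a^{i_1 \cdots i_k}_{j_1 \cdots j_k}\, \phi^{i_1}_{j_1} \otimes \cdots \otimes \phi^{i_k}_{j_k},$

where

$a^{i_1 \cdots i_k}_{j_1 \cdots j_k} = \int S(x_1, \ldots, x_n)\, \phi^{i_1}_{j_1}(x_{i_1}) \cdots \phi^{i_k}_{j_k}(x_{i_k})\, \mu_1(dx_1) \cdots \mu_n(dx_n).$

Remark 4.1. Actually (4.10) does not cover the case $k = 0$. In this case $P_{\emptyset}S =
a_{1 \cdots 1} = \mathcal{E}S$.

Theorem 4.3 gives a “coordinatewise” description of $P_{i_1 \cdots i_k}$ given complete
orthonormal systems.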

In Efron and Stein(1981), Bhargava(1980) and Karlin and Rinott(1982) these
projections are given using conditional expectations. We will show that the two definitions
are the same. Let $E_i : L^2(X_1, \ldots, X_n) \to L^2(X_1, \ldots, X_n)$ be defined by

(4.11)  $E_i\phi = \int \phi(x_1, \ldots, x_n)\, \mu_i(dx_i) = \mathcal{E}(\phi \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n).$

Let $I$ denote the identity map in $L^2(X_1, \ldots, X_n)$. Then

(4.12)  $(I - E_i)\phi = \phi - \mathcal{E}(\phi \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n).$

Let

(4.13)  $H_{i_1 \cdots i_k} = G_1 \circ \cdots \circ G_n,$

where $\circ$ denotes composition of the maps and

    $G_i = I - E_i$ if $i \in \{i_1, \ldots, i_k\}$,
    $G_i = E_i$ otherwise.


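Then

Theorem 4.4.

(4.14)  $H_{i_1 \cdots i_k} = P_{i_1 \cdots i_k}.$

Proof: Since $\{\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\}$ forms a basis it suffices to prove that

$H_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}) = P_{i_1 \cdots i_k}(\phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n})$

for all $(j_1, \ldots, j_n)$. Let $\psi = \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}$. Then

$E_i\psi = \int \phi^1_{j_1} \otimes \cdots \otimes \phi^n_{j_n}\, \mu_i(dx_i).$

Hence

$E_i\psi = 0$ if $j_i \ge 2$,
$E_i\psi = \psi$ if $j_i = 1$.

From this it follows that

(4.15)  $H_{i_1 \cdots i_k}\psi = \psi$  if $j_l \ge 2$ for
            $l \in \{i_1, \ldots, i_k\}$ and $j_l = 1$ for $l \notin \{i_1, \ldots, i_k\}$,
        $= 0$  otherwise.

This is identical to (4.8).  ∎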


If we expand the right hand side of (4.13) we obtain the inclusion-exclusion pattern
of conditional expectations. For example

(4.16)  $H_1S = (I - E_1) \circ E_2 \circ \cdots \circ E_nS$
        $= \int S\, \mu_2(dx_2) \cdots \mu_n(dx_n) - \int S\, \mu_1(dx_1) \cdots \mu_n(dx_n)$
        $= \mathcal{E}(S \mid x_1) - \mathcal{E}(S).$

(4.17)  $H_{12}S = (I - E_1) \circ (I - E_2) \circ E_3 \circ \cdots \circ E_nS$
        $= E_3 \circ \cdots \circ E_nS - E_1 \circ E_3 \circ \cdots \circ E_nS$
        $\quad - E_2 \circ E_3 \circ \cdots \circ E_nS + E_1 \circ E_2 \circ \cdots \circ E_nS$
        $= \mathcal{E}(S \mid x_1, x_2) - \mathcal{E}(S \mid x_2) - \mathcal{E}(S \mid x_1) + \mathcal{E}(S).$

These expressions are used as definitions of the terms of the ANOVA type decomposition
in Efron and Stein(1981), Bhargava(1980), and Karlin and Rinott(1982).
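On finite probability spaces the operators $E_i$ of (4.11) reduce to weighted averages, so the decomposition can be computed exactly. The sketch below assumes three finite coordinate spaces with made-up probability vectors `mu` and a made-up statistic `S`; none of these values come from the paper. It checks the expansion (4.17), the mutual orthogonality of the $2^n$ terms under the product measure, and that the terms sum to $S$. Coordinates are numbered from 0.

```python
from fractions import Fraction as Fr
from itertools import combinations, product
from math import prod

# Hypothetical probability vectors mu_1, mu_2, mu_3 (each sums to 1).
mu = [
    [Fr(1, 2), Fr(1, 2)],
    [Fr(1, 6), Fr(1, 3), Fr(1, 2)],
    [Fr(1, 4), Fr(3, 4)],
]
points = list(product(*(range(len(m)) for m in mu)))
# Hypothetical square integrable statistic S(x_1, x_2, x_3).
S = {p: Fr(p[0] * p[1] + (p[2] + 1) ** 2 - p[0] * p[2]) for p in points}


def E(i, phi):
    """Operator E_i of (4.11): integrate x_i out against mu_i."""
    return {p: sum(mu[i][t] * phi[p[:i] + (t,) + p[i + 1:]]
                   for t in range(len(mu[i])))
            for p in phi}


def H(subset, phi):
    """Composition (4.13): I - E_i for i in subset, E_i otherwise."""
    for i in range(len(mu)):
        e = E(i, phi)
        phi = {p: phi[p] - e[p] for p in phi} if i in subset else e
    return phi


def cond_exp(keep, phi):
    """Conditional expectation given x_i, i in keep."""
    for i in range(len(mu)):
        if i not in keep:
            phi = E(i, phi)
    return phi


def inner(phi, psi):
    """L^2 inner product under the product measure."""
    return sum(prod(mu[i][p[i]] for i in range(len(mu))) * phi[p] * psi[p]
               for p in phi)


n = len(mu)
subsets = [s for k in range(n + 1) for s in combinations(range(n), k)]
parts = {s: H(s, S) for s in subsets}

h12 = parts[(0, 1)]
c01, c1, c0, c = (cond_exp(k, S) for k in [(0, 1), (1,), (0,), ()])
expansion = {p: c01[p] - c1[p] - c0[p] + c[p] for p in points}   # (4.17)

print(h12 == expansion)
```

The term for the empty subset is the constant $\mathcal{E}S$, and the squared norms of the remaining terms add up to the variance of $S$, which is the form in which the decomposition enters the study of the Jackknife estimate of variance.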
using the notions and notations of tensor analysis and multilinear algebra.

A theory of tensors is developed in such a way that (i) it can be immediately
applied in computer programs, (ii) it can be easily generalized to $L^2$
spaces.